As of January, 1974, the Program Evaluation Project is funded by a three-year collaborative grant 
with the Mental Health Services Division of the National Institute of Mental Health. The purpose of the 
grant is to emphasize the coordination and dissemination of information on a variety of program evaluation 
methodologies. Currently, it is expected that the title of the organization will be changed to the Program 
Evaluation Resource Center during 1974. 

Further information on the Goal Attainment Scaling methodology and program evaluation is available in 
other written and recorded materials from the Program Evaluation Project office. Chapter One, "Basic Goal 
Attainment Scaling Procedures", Chapter Three, "An Introduction to Reliability and the Goal Attainment 
Scaling Methodology", and Chapter Nine, "Evaluation of the Adult Outpatient Program, Hennepin County Mental 
Health Service" of the P.E.P. Report 1969-1973 are now available. Additional chapters will be released 
this year as they are completed. 



SYNOPSIS FOR CHAPTER FIVE 
A CONSTRUCT VALIDITY OVERVIEW OF GOAL ATTAINMENT SCALING 



PURPOSE: The establishment of validity is one of the major tasks of the developers of a measurement 
methodology. In this chapter, it is argued that the construct validity approach is essential to an 
understanding of the validity of Goal Attainment Scaling, since there are no clear criteria available for 
concurrent validation. 

The chapter discusses data from a variety of studies on Goal Attainment Scaling in an effort to il- 
lustrate various facets of the construct validity approach when applied to the methodology. The follow- 
ing chapter presents the results of one particular study of the validity of Goal Attainment Scaling. 



MAJOR FINDINGS : It is emphasized here that there have been many approaches to validity, but the Cronbach- 
Meehl concept of construct validity seems to be the most inclusive. If there is a basic construct under- 
lying Goal Attainment Scaling it is the "attainment of expectations", but this major construct is ac- 
companied by many other variables related to the many possible ways in which Goal Attainment Scaling can 
be applied. 

A list of these accompanying variables and a system for illustrating hypotheses about Goal Attainment 
Scaling are presented. This system is utilized to present several findings, the clearest being that, as 
predicted, the Goal Attainment score is not significantly related to client characteristics such as age, 
sex, education, marital status or intelligence. In one study of adult outpatients and day treatment cases, 
Goal Attainment scores based on therapist scoring were correlated from .58 to .84 with two questions of 
global ratings of treatment outcomes answered by the therapists. The correlation of the Goal Attainment 
score with the Consumer Satisfaction Index was .23 for one group, with correlations for individual consumer 
satisfaction items ranging from -.12 to .46. The Goal Attainment score was shown to be correlated .31 
with predictive accuracy for one group of adult outpatients. In general, correlations with other measures, 
as expected, are positive but low to moderate. In early results from the drug effectiveness study, for 
example, all concurrent validity coefficients were .52 or less. 

In general, the original Kiresuk-Sherman hypotheses about Goal Attainment Scaling are supported. The 
findings underscore the idea that Goal Attainment Scaling can be applied in a variety of settings. 
The two central concepts in the analysis of 
measurement systems have historically been 
validity and reliability. The reliability of Goal 
Attainment Scaling is discussed at some length in 
Chapters Three and Four of the P.E.P. Report 1969-1973. 



This chapter presents suggestions for approaching 
the study of the "validity" of Goal Attainment 
Scaling. The Introduction is an attempt to link 
traditional psychometric comments on validity, 
especially construct validity, to the special 
characteristics of Goal Attainment Scaling. 
Section I is an outline of the variables available 
for validity studies of Goal Attainment 
Scaling by the Program Evaluation Project staff. 
Section II discusses briefly the generation of 
hypotheses related to Goal Attainment Scaling. 
Section III shows Program Evaluation Project 
data applied to issues in Goal Attainment 
Scaling validity. 



Introduction 



As Ebel (1961) comments, validity is an im- 
precise term due to "logical and operational 
limitations of the concepts of validity itself". 
He argues that "...faster progress will be made 
toward more educational and psychological tests 
if validity is given a much more specific and re- 
stricted definition than is usually the case, 
and if it is no longer regarded as the supremely 
important quality of every mental test". 
The classic statement on validity is Lindquist's 
(1942) observation that "the validity of a test 
may be defined as the accuracy with which it 
measures that which it is intended to measure 
or as the degree to which it approaches infalli- 
bility in measuring what it purports to measure". 
Loevinger (1957) argues that "this definition is 
too vague, too remote from factual measuring 
operations, to be useful ..." Ebel also argues 
that this definition is too general and not 
particularly useful. He proposes that these 
points of test "quality" be examined instead: 

1. The importance of the inferences that 
can be made from the test scores. 

2. The meaningfulness of the test scores, 
based on: 

a. An operational definition of the 
measurement procedure. 

b. A knowledge of the relevance of 
the scores to other measures, from: 

i. Validity coefficients, pre- 
dictive and concurrent. 

ii. Other correlation coefficients 
or measures of relationship. 



c. A good estimate of the reliability of 
the scores. 

d. Appropriate norms of examinee perform- 
ance. 

3. The convenience of the test in use. 

Ebel concludes that these concepts answer the 
primary questions about the utility of a test, and 
that points 2a, 2b-i, and 2b-ii need not be 
established for all tests. 

A more stringent approach is presented by 
Raymond Cattell (1964) who argues that some very 
mathematically inclined "psychometricians have some- 
times seemed lost in their labyrinthine fastnesses 
from logic, from common sense, and certainly from 
psychological perspective". He suggests that valid- 
ity is "...the capacity of a test to predict some 
specified behavioral measure (or set of measures) 
other than itself". This use of "predict" could be 
taken, for the purposes of outcome evaluative mea- 
sures, in the very broadest sense, such as "a high 
score for treatment X on measure Y for group Q pre- 
dicts a high outcome for other groups similar to 
group Q who receive treatment X". In any case, Cat- 
tell specifies three parameters of validity: 

1. "Degree of Abstraction of the Referent 
Criterion." This parameter might also be 
called the generalizability of results, 
with one extreme being a very concrete test, 
such as "ability to clean armadillos quickly", 
and the other extreme being a very 
conceptual test, such as "creativity". 

2. "Degree of Naturalness of the Criterion." 
This parameter varies from a criterion 
based on a "Natural" situation, such as 
coping with a real disaster, to correlation 
with an "Artificial" criterion situation, 
such as another test. 

3. "Degree of Directness of Validation." This 
parameter varies from "direct" or complete 
correlation with the criterion to "indirect 
or circumstantial" correlation with the 
criterion. The directness refers to the 
patterns by which variables are correlated. 
For example, two measures could both be 
correlated .3 with criterion X, and yet 
measure #1 could correlate with only one aspect 
of criterion X, whereas measure #2 could 
relate to an entirely different aspect of 
criterion X. 

After introducing these three parameters, 
Cattell discusses some of the concepts frequently called 
"validity". He maintains that there is a group of 
"utility coefficients" which are not validity 
coefficients at all and includes "face validity", 
"content validity" and "semantic validity" in this group 
of non-validity concepts. Mosier (1947), writing 
much earlier, also recommends that "face validity" be 
dropped from the vocabulary because it has too many 
confusing and vague definitions. Cattell (1964) 
appears to see the "...integration of psychomet- 
rics with personality theory and general psycho- 
logical theory..." as an ultimate goal for im- 
proving validity conceptualizations, in much the 
same way that Loevinger (1957) stresses it. In 
fact, Loevinger maintains that "content validity 
is established by the judgment of the investigator 
that the items are valid; it is thus also contin- 
gent upon a special, non-generalizable circum- 
stance, to wit, the particular investigator.... 
Since ad hoc arguments are scientifically of minor 
importance, if not actually inadmissible, what is 
left, construct validity, is the whole of the sub- 
ject of a systematic scientific point of view". 

The viewpoints presented above reveal the 
range in theoretical thinking about validity. 
In this discussion, construct validity is 
utilized as a basis for the presentation, since 
it allows for flexibility and subsumes many 
earlier "forms" of validity. 

This approach to the validity issue is based 
on Cronbach and Meehl's (1955) version of the 
concept of "construct validity", which they added to 
earlier "types" of validity, such as concurrent, 
content, and predictive. It is very germane to 
Goal Attainment Scaling theory that Cronbach and 
Meehl observe (evidently citing Gaylord), "When 
an investigator believes that no criterion 
available to him is fully valid, he perforce 
becomes interested in construct validity because 
this is the only way to avoid the infinite 
frustration of relating every criterion to some 
more ultimate standard... Construct validity 
must be investigated whenever no criterion or 
universe of content is accepted as entirely 
adequate to define the quality to be measured." 

This lack of a clear-cut criterion for com- 
parison is a central issue in the discussion of 
Goal Attainment Scaling validation. There is no 
clear-cut criterion for either mental health or 
therapy effectiveness in mental health. With the 
use of Goal Attainment Scaling, in effect, a new 
criterion for mental health treatment is selected, 
that is, "the degree to which expectations for 
treatment are achieved." The expectations may be 
developed by the clinicians, the clients, other 
persons, or some combination of involved persons. 
Since this form of criterion is so new, no 
immediately applicable standard for comparison with 
Goal Attainment Scaling has been located. Al- 
though some instruments may be suitable for 
examining certain aspects of Goal Attainment 
Scaling, none have been identified which are as 
comprehensive or "expectation-oriented" as 
Goal Attainment Scaling. 

If a clinician developed the expectations 
appearing on a particular follow-up guide, the 
result could be taken theoretically as a one-person 
sample of the expectations for treatment which 
would be imposed by the hypothetical average 
clinician. In any case, as noted above, the use of 
individualized expectations as a criterion is so 
unusual that there are no easily applicable 
concurrent measures, and a construct validity approach 
appears essential. 

The construct, for Meehl and Cronbach, is 
"...some postulated attribute of people assumed 
to be reflected in test performance.... A con- 
struct has certain associated meanings carried 
in statements of this general character: 
Persons who possess this attribute will, in 
situation X, act in manner Y (with a stated 
probability)." It is noteworthy that, even in this 
very pertinent article, the language of 
psychometrics is not well adapted to outcome evaluation 
in general, or Goal Attainment Scaling in particular. 
For purposes of the discussion of Goal Attainment 
Scaling, the above phrase would have to be changed to 
read, perhaps, "Agencies, treatment modes, or persons 
who possess this attribute will, in situation X, act 
in manner Y or will have acted in manner Y (with a 
stated probability)." 

Despite their relatively straightforward def- 
inition of a "construct" the authors emphasize that 
there is no single, simple coefficient of construct 
validity, but that construct validity must be thought 
of in terms of a "nomological net" of constructs 
linked by testable hypotheses. Meehl and Cronbach 
(1955) delineate construct validity by a series of 
axioms: 

1. Constructs of varying degrees of 
definitiveness are defined by a network of 
propositions. 

2. There must be predicted relationships 
among variables. 

3. The network must be explicit. 

4. Many types of evidence are relevant to 
construct validity and both high and low 
correlations may be useful evidence for 
the proposed nomological net. 

5. When a predicted relationship fails to be 
observed, the network of constructs must 
be redefined. 

6. There is no simple coefficient of construct 
validity. 

7. General scientific procedures are used. 

Cronbach and Meehl's discussion, like that of 
most of the authors mentioned before, is based large- 
ly on the concept of the trait or characteristic. An 
evaluation technique, however, is not usually aimed 
at measuring a trait of a person, but rather at 
measuring a pattern of changes or effects in an agency 
(which may be measured either for the agency as a 
whole or through the sum of effects on a number of 
persons). Wiggins (1973) describes a "trait" as 
"...a hypothetical construct which provides an organ- 
izing principle for relating a variety of superfi- 
cially dissimilar behaviors under a single dispo- 
sitional unit". If there is a trait underlying the 
Goal Attainment score, it would be "the tendency to 
attain expectations" as noted above. The Goal 
Attainment expectations could be set by one or more 
of a number of sources, such as the client, the 
therapist, and so on. Clearly, even if Goal At- 
tainment Scaling can be linked to trait concepts, 
the methodology demands a loose application of 
many validity ideas. 

These experts on psychometrics have been cited 
above to suggest the flexibility of the "validity" 
concept. Possibly the most basic characteristic 
of validity is that, as Nunnally (1967) says, 
"Validity is a matter of degree rather than an 
all-or-none property, and validation is an unending 
process.... Strictly speaking, one validates not 
a measuring instrument, but rather some use to 
which the instrument is put." Nunnally concludes 
that "...sufficient evidence for construct validity 
is that the supposed measures on the instrument 
...behave as expected". 

This general commentary on validity has been 
used to introduce a few construct-oriented guide- 
lines. Possible variables related to the "attain- 
ment of expectations" construct underlying Goal 
Attainment Scaling are outlined in the first two 
sections. Then, in the third section, findings on 
validity will be summarized for Goal Attainment 
Scaling. 



I. Variables Relevant to Goal Attainment Scaling 
Construct Validity 

As observed above, Goal Attainment Scaling is 
not a typical "trait-type" problem in validity. 
Instead, the methodology involves a collection of 
characteristics and activities which can be inter- 
connected by predictions and hypotheses. It is 
the utility of Goal Attainment Scaling in evalu- 
ation methodology which must be examined in the 
context of a network of constructs. 

Ebel's three points related to an instrument's 
"quality" or validity were presented on the first 
page of this discussion. His divisions of 
"quality" seem appropriate to the considerations 
of the practical issues of validating an outcome- 
oriented methodology like Goal Attainment Scaling. 
Thus, in the third part of this chapter where 
construct validity issues are empirically examined, 
Ebel's outline will be followed as a matter of 
convenience. 

This presentation is not intended to be a 
contribution to the reliability versus validity 
controversy. Ebel (1961), as mentioned previously, 
includes reliability as a basic factor in test 
"quality". Nunnally (1967) states that "... con- 
sistency is a necessary but not sufficient con- 
dition for construct validity". Therefore, re- 
liability is subsumed in part of the following 
discussion, although a more thorough description 
of these reliability results is presented in 
Chapters Three and Four of the P.E.P. Report 1969-1973. 



In summary, the validity issues for this 
discussion of Goal Attainment Scaling are oriented 
to a construct validity perspective in which 
basic questions are asked about the methodology's 
utility and measurement properties in the 
program evaluation context. (The effectiveness of 
utilizing Goal Attainment Scaling as part of the 
therapeutic process is not within the scope of 
this chapter, except in reference to specific 
measurement studies.) This perspective may in- 
clude correlations of the Goal Attainment scores 
with criteria, a procedure often called "concurrent 
validity", or content analyses which could be re- 
lated to the so-called "content validity", but the 
basic thrust of the discussion is to examine the 
network of relationships between constructs, the 
Goal Attainment scores and other variables. 
Anastasi (1970) comments bluntly, however, that 
"...content, criterion-oriented, and construct 
validity do not correspond to distinct or logical- 
ly coordinate categories. On the contrary, con- 
struct validity is a comprehensive concept, which 
includes the other types." 

The primary construct being examined is "out- 
come" or "attainment of expectations". "Outcome" 
could actually be considered a collection of dif- 
ferent constructs. For example, "outcome after one 
month of treatment" is certainly a different set 
of expectations, and a different construct, with 
different postulated attributes, different 
predictions, and a different meaning than "outcome at six 
months." 

The basic components of Goal Attainment Scaling 
procedures may be considered in terms of characteristics 
and activities. Characteristics are attributes 
of the persons involved in the Goal Attainment Scaling 
situations, and activities are considered to be 
behaviors or procedures of the Goal Attainment Scaling 
process. Both aspects are represented here by the 
data available from the Program Evaluation Project, 
some of which are incomplete. 

Characteristics 

A1. Of the persons (or agency) being represented 
on the Goal Attainment Follow-up 
Guide. 

A2. Of the person(s) constructing the Goal 
Attainment Follow-up Guide. 

A3. Of the person(s) scoring the Goal At- 
tainment Follow-up Guide. 

Activities 

B1. The rules and procedures used to construct 
the Goal Attainment Follow-up Guide. 

B2. The treatment being used in the agency. 

B3. The way in which the Goal Attainment 
results are expressed. 

B4. The way the Goal Attainment Scaling re- 
sults are used. 
Here are some more specific variables included 
under the seven major components mentioned above: 

A1. The person represented on the follow-up 
guide. (If organizations were being rep- 
resented on the follow-up guide, their 
characteristics would be listed here, such 
as size, income, available evaluative re- 
sources and so on.) 

a. Truthfulness, completeness and abil- 
ity to communicate and/or predict. 

b. Age, sex, intelligence, and other 
variables. 

c. History of treatment. 

d. Problems presented and diagnosis. 

A2. The person constructing the Goal Attain- 
ment Follow-up Guide. 

a. Who is it? the client, a spouse of 
the client, a relative, a clini- 
cian(s), some combination of the 
above persons, or others? 

b. How experienced is the constructor 
at Goal Attainment Scaling and has 
the constructor been trained in 
Goal Attainment Scaling? 

c. How accurate is the constructor's 
ability to predict outcome? 

d. Personality and demographic character- 
istics of the constructor and/or ed- 
ucational or discipline background. 

A3. The person scoring the Goal Attainment 
Follow-up Guide. 

a. Who is it? the client, the person 
constructing the follow-up guide, the 
person giving the treatment or a sep- 
arate person? 

b. How experienced and skilled is the 
person at scoring the Goal Attainment 
Follow-up Guide? 

c. Personality, educational, demographic 
and discipline characteristics of the 
scorer. (If other than the client?) 

d. How skilled at interviewing is the 
person scoring? 

B1. Rules and procedures used to construct 
the Goal Attainment Follow-up Guide. 

a. What form of the Goal Attainment 
Follow-up Guide is used? 

b. What rules of construction are used 
and are the follow-up guides re- 
checked to see if the rules are met? 



c. Is the Goal Attainment Follow-up Guide 
aimed at a specific date in the future 
and how far off is that date? 

d. Is the level at intake used? 

e. Can scales be differently weighted? 

f. Is the Goal Attainment Scaling semi- 
standardized, standardized or com- 
pletely idiosyncratic? 

g. Is the treatment to be received and/or 
the person who will be treating known 
to the follow-up guide constructor? 

B2. The treatment being used. 

a. What form of treatment is being used 
and what is available? 

b. What rules of treatment choice (e.g. 
random) are used and is their use 
monitored? 

c. Is the treatment limited in time and 
how long does it last? 

d. Is the Goal Attainment Follow-up Guide 
available to the treater or used as 
part of treatment? 

e. Can treatments be changed during the 
client/treater interaction and how 
often are they changed in practice? 

f. Is there any limit on the type or num- 
ber of problems to be treated? 

B3. The type of measure used to express the 
Goal Attainment results. 

a. Goal Attainment score mean (depends on 
follow-up results). 

i. Kiresuk-Sherman Goal Attainment 
score (varies from 20 to 80). 

ii. Scale-by-scale Goal Attainment 
score (varies from -2 to +2) 
for either an individual scale 
or the mean for an entire 
follow-up guide. 

b. Goal Attainment score variance (depends 
on follow-up results). 

c. Change score (depends on follow-up 
results and whether the initial status 
of the client is noted on the follow- 
up guide). 

d. Predictive accuracy, also called Mean 
Inaccuracy Score (depends on follow-up 
results). 

e. Contents or types of problems included 
on the Goal Attainment Follow-up Guide. 



f. Reaching arbitrarily established levels, 
regardless of expectations or change. 

g. Are rules and procedures established for 
the follow-up scorer? 

h. How often are scales unscoreable and 
how confident is the follow-up scorer 
in the accuracy of his score? 

B4. The way the Goal Attainment score results 
are used. 

a. Who receives the results, supervisors, 
no one, clients, treaters, legislators? 

b. What are the consequences of the re- 
sults, i.e., information, salary, 
employment, advancement, publicity, 
peer pressure, etc.? 

c. How long will the results be received 
and how many times or how often? 

d. Will the results be used in conjunc- 
tion with different measures, i.e., 
costs? 

This array of variables illustrates the immen- 
sity of the validation task for Goal Attainment 
Scaling, and this list is not necessarily complete. 
There are thirty-seven components listed here, some 
of which contain multiple variables. 
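
Item B3a above refers to the Kiresuk-Sherman Goal Attainment score without stating how it is computed. The sketch below uses the summary formula usually attributed to Kiresuk and Sherman (1968); that formula is not given in this chapter, so it should be read as an assumption. The average inter-scale correlation of .30 is the conventional default rather than a figure from this text, and the function name `goal_attainment_score` is hypothetical.

```python
import math
from typing import Sequence


def goal_attainment_score(
    scale_scores: Sequence[float],
    weights: Sequence[float],
    rho: float = 0.30,  # assumed average inter-scale correlation
) -> float:
    """Summary score: 50 + 10 * sum(w*x) / sqrt((1-rho)*sum(w^2) + rho*(sum(w))^2).

    scale_scores are the scale-by-scale results (-2 to +2); weights are the
    relative weights assigned to each scale on the follow-up guide.
    """
    if not scale_scores or len(scale_scores) != len(weights):
        raise ValueError("need one weight per scale and at least one scale")
    if any(not -2 <= x <= 2 for x in scale_scores):
        raise ValueError("scale scores must lie between -2 and +2")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")

    weighted_sum = sum(w * x for w, x in zip(weights, scale_scores))
    sum_sq_weights = sum(w * w for w in weights)
    sum_weights = sum(weights)
    denominator = math.sqrt((1 - rho) * sum_sq_weights + rho * sum_weights ** 2)
    return 50 + 10 * weighted_sum / denominator


# Three equally weighted scales, all at the expected level (0).
print(goal_attainment_score([0, 0, 0], [1, 1, 1]))
```

Under these assumptions, a follow-up guide with every scale at the expected level scores 50, and a guide with a single scale scored +2 scores 70.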

In addition to the three characteristics and 
four activities listed previously, a related char- 
acteristic which should be discussed here is the 
effect of the experimental design in which Goal 
Attainment Scaling is being applied. It should be 
understood that Goal Attainment Scaling is a 
measurement tool which can be utilized within a range 
of desired degrees of formal scientific or experi- 
mental procedures, such as control groups, random 
assignment, impartial interviewers, etc. Such 
experimental procedures are part of the total matrix 
of the utilization of Goal Attainment Scaling but 
are not necessarily included in the list of Goal 
Attainment Scaling activities and characteristics. 

Other variables are available for some pur- 
poses as partial comparative criteria. These in- 
clude a variety of commonly utilized outcome cor- 
relates and are other methods for measuring what 
is usually assumed to be some segments of "out- 
come". Goal Attainment Scaling is designed to be 
more comprehensive and more sensitive to client 
outcome than such measures (Kiresuk and Sherman, 
1968). Available criteria include: 

1. Minnesota Multiphasic Personality Inven- 
tory (MMPI) and other personality measures. 

2. Intelligence Quotient (IQ) results. 

3. Brief Psychiatric Rating Scale. 

4. Self-Rating Symptom Scale. 

5. Consumer satisfaction scores. 



6. Therapist ratings of global improvement. 

7. Differences among groups receiving dif- 
ferent treatments. 

8. Taylor Manifest Anxiety Scale. 

These eight potential criteria, plus the 
thirty-seven components of Goal Attainment Scaling 
presented above, total at least forty-five components for 
the study of Goal Attainment Scaling construct 
validity. Obviously, many other criteria could be 
utilized, further expanding the list of components 
which could be examined. 
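
The counts quoted above can be checked against the outline in Section I with a short tally. This is only an arithmetic illustration; the dictionary name and layout are hypothetical, and the per-component counts are simply the number of lettered items under each heading in the list.

```python
# Number of lettered items under each component in the outline above.
gas_components = {
    "A1": 4,  # a-d
    "A2": 4,  # a-d
    "A3": 4,  # a-d
    "B1": 7,  # a-g
    "B2": 6,  # a-f
    "B3": 8,  # a-h
    "B4": 4,  # a-d
}
criteria_count = 8  # the eight comparison criteria listed above

gas_total = sum(gas_components.values())
overall_total = gas_total + criteria_count
print(gas_total, overall_total)
```

The tally agrees with the thirty-seven components and the total of forty-five stated in the text.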



II. Generating and Representing Hypotheses about 
Goal Attainment Scaling 

Clearly, not all the possible relationships 
among these activity and characteristic 
components of Goal Attainment Scaling can be 
represented, but a cross section of hypothetical links 
can be suggested for some of the more important 
components. Predicted relationships among the 
components will be represented by arrows (→) 
with a superscript of "high" for a predicted 
correlation of over (.70), "moderate" for a predicted 
correlation of (.40) to (.69), "low" for a 
predicted correlation of (.15) to (.39), and "no" 
for a correlation of (-.14) to (.14). Positive 
correlations are "+" and negative correlations 
are "-". "Crit" refers to one of the eight 
criterion variables, but other codes refer to the 
Goal Attainment Scaling components listed 
previously. 
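
The coding scheme just described can be sketched as a small function. This is an illustration only, not part of the Program Evaluation Project's procedures. The name `classify_correlation` is hypothetical, and two choices are assumptions because the text leaves them open: a correlation of exactly .70 is treated as "high", and values are rounded to two decimals so that nothing falls between the stated bands.

```python
def classify_correlation(r: float) -> str:
    """Label a correlation using the bands described in the text.

    Assumptions (not stated in the chapter): exactly .70 counts as "high",
    and r is rounded to two decimals so the bands leave no gaps.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation must lie between -1 and 1")
    r = round(r, 2)
    size = abs(r)
    if size < 0.15:
        return "no"  # the text gives no sign for the "no" band
    sign = "+" if r > 0 else "-"
    if size >= 0.70:
        level = "high"
    elif size >= 0.40:
        level = "moderate"
    else:
        level = "low"
    return f"{sign}{level}"


# Coefficients quoted in the synopsis of this chapter.
for r in (0.58, 0.84, 0.23, 0.31, -0.12):
    print(f"{r:+.2f} -> {classify_correlation(r)}")
```

Under this reading, the .31 correlation with predictive accuracy reported in the synopsis falls in the "low" band, and the therapist global ratings (.58 to .84) span "moderate" to "high".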

Construct and variable relationships can be 
hypothesized and represented for Goal Attainment 
Scaling theory with this system of activities, 
characteristics, and criteria. One of the 
difficulties with the construct validity approach, 
however, is that standards for the number and 
nature of the links required in the nomological 
network have not been clearly established. In 
addition, since it could be assumed that change 
is expected over time for "outcome" measures, there 
are actually different constructs with different 
sets of variable relationships when there are dif- 
ferent lengths of time between follow-up guide con- 
struction and follow-up scoring. For example, al- 
though age would not be expected to be associated 
with the outcome construct of Goal Attainment if 
the follow-up is completed inside a year or so, age 
could have some effect if the follow-ups were 
completed only after five years and either children 
or elderly persons were included, as Figure I 
illustrates. 



FIGURE I 
These and other theoretical considerations imply 
that any summary comments about the construct 
validity of Goal Attainment Scaling should be 
cautious. 

The hypotheses associated with Goal Attain- 
ment Scaling are still fairly rudimentary and 
only a few have been tested extensively. The 
most firmly established hypothesis is that Goal 
Attainment scores are relatively unresponsive 
to the demographic characteristics of either 
individuals or groups of clients. This lack of 
relationship was predicted because the very 
process of setting the Goal Attainment Scaling 
levels is based on developing expectations which 
should allow for the unique features, demographic 
or otherwise, of the client. The evidence in 
relation to other hypotheses, as discussed below, 
is somewhat sparser. The collection and analysis 
of data relevant to the hypotheses continues. 

The original Kiresuk and Sherman article of 
1968 predicted that the Goal Attainment score 
should have a low to moderate correlation with 
already existing outcome measures. Since the 
Goal Attainment score is based on an individualized 
measurement system, it should not have a 
high correlation with non-individualized measures; 
and since it is specific and goal-oriented, it 
should differ from global or change-oriented 
measures even if they are individualized. 

FIGURE II: Sections of the Hypothetical Nomological Net for 
Goal Attainment Scaling. 

(Diagrams for FIGURE IIa, FIGURE IIb, and FIGURE IIc are not 
reproduced. Node labels that survive include: "Accuracy of the 
follow-up guide construction predictions"; "Characteristics of 
follow-up guide constructor"; "Relationship of the follow-up guide 
constructor and client"; and "Availability of follow-up guide 
during ...".) 


Figure IIa illustrates several hypotheses that 
could be tested. For example, the part of the 
diagram in the upper left quadrant of Figure IIa 
advances the hypothesis that the Goal Attainment score 
mean (B3a), as a representative of the "outcome" construct, 
should have a correlation with criterion 6 
(therapist ratings of global improvement). 

Figures IIb and IIc are also representations of 
intervariable relationships. Figure IIc suggests 
that the mean Goal Attainment score should be slightly 
related to the accuracy of predictions by the 
follow-up guide constructor and to the relationship of the 
follow-up guide constructor to treatment. 



III. Validity Measures for Goal Attainment Scaling 

In the following section, this representational 
system is used to discuss some hypotheses about Goal 
Attainment Scaling with pertinent data. 

The three major areas of test "quality" estab- 
lished by Ebel will be used to separate the presen- 
tation of these components' interrelationships into 
smaller units for ease of discussion. Ebel's points 
were presented previously on page 3. 

Examples will be presented in relation to each 
of these points. 



A. "The Importance of the Inferences that Can Be 
Made" 

Clearly, if a criterion of effectiveness in 
mental health treatment could be established, it 
would be a very significant contribution to 
evaluation and psychopathology knowledge. 
Specific examples of investigating treatment 
effectiveness include: 

1. Differences Among Four Outpatient 
Treatment Modes 



TABLE I: Outcome Scores for Clients Who Stayed in 
Assigned Treatment Mode at Least Half of 
Their Treatment Sessions for the Four Mode 
Study 



In a group of clients randomly distributed 
among treatment modes, the mean Goal Attainment 
score should vary among modes which have different 
degrees of effectiveness. 

FIGURE III: B2a (Type of Treatment) → B3a (Differences in Mean 
Goal Attainment Scores) 


The results in the earliest study using Goal 
Attainment Scaling suggested that differences 
among modes were not great when all clients were 
considered as a group regardless of other 
variables. As of June 5, 1972, for 186 nonrandomly 
assigned subjects, the mean Goal Attainment scores 
for the four modes of therapy in the original 
Program Evaluation Project study ranged only from 
48.7 for Day Treatment, through 50.2 for Individual 
Therapy and 50.3 for Drug Clinic, to 51.1 for Group 
Therapy. (The procedures for this study are 
discussed in other P.E.P. Report 1969-1973 chapters.) 

Total results for all 249 randomly assigned 
clients as of 1973, reveals even less variation 
among treatment modes in terms of Goal Attainment 
score. 



    Treatment Z    50.0
    Treatment Y    50.2
    Treatment X    50.3
    Treatment W    50.8



These means are not statistically different even
at the p < .10 level, and certainly there is little
clinical significance to such minuscule differences.
(The above data are based on assigned
treatment modes, not actual treatment patterns.)

More recent data for the randomly assigned
cases in this four-mode study of the Program
Evaluation Project, whose procedures are described
elsewhere in the P.E.P. Report 1969-1973,
are slightly more encouraging, as Table I suggests,
when separated by randomization pattern.

There are, however, no Goal Attainment score
differences which reach the p < .10 level. It is
noteworthy that the Consumer Satisfaction Index
differences in means also do not reach this level
of statistical significance. It is not clear,
however, that these treatments would be expected
to be significantly different in outcome, especially
since the meanings of these various modes
were not rigorously defined.







TABLE I

                                        TREATMENT W    TREATMENT X    TREATMENT Y    TREATMENT Z

PATTERN I
  Mean Goal Attainment Score            47.34 (N=3)    52.22          48.82 (N=3)    [illegible] (N=4)
  Consumer Satisfaction Index           57.14 (N=3)    67.77 (N=3)    55.55 (N=3)    81.54 (N=4)

PATTERN II
  Mean Goal Attainment Score            47.27 (N=6)    41.41          48.97 (N=9)    -
  Mean Consumer Satisfaction Index      63.09 (N=6)    68.97 (N=7)    67.45 (N=9)    -

PATTERN III
  Mean Goal Attainment Score            52.65 (N=10)   49.04 (N=9)    -              56.32 (N=14)
  Mean Consumer Satisfaction Index      77.81 (N=10)   76.19 (N=9)    -              76.36 (N=14)

PATTERN IV
  Mean Goal Attainment Score            [illegible] (N=119)  52.89 (N=62)   -        -
  Mean Consumer Satisfaction Index      60.29 (N=118)  72.96 (N=61)   -              -

2. Two Treatment Modes at a Day Treatment 
Center 

Another study involving randomization of 14
clients between two therapy modes, however, has
shown statistically significant differences. In
the Day Treatment Center of the Hennepin County
Mental Health Service, half of the clients were
randomly assigned to a situation in which they prepared
their own Goal Attainment Follow-up Guides,
while the other half of the clients did not directly
set goals for themselves. A clinician constructed
standard Goal Attainment Follow-up Guides for both
groups. Based on follow-up interviewer ratings of
these clinician-prepared follow-up guides, the
clients who were involved in their own goal-setting
had a mean Goal Attainment score of 71, while the
clients who did not set goals had a mean Goal Attainment
score of 59, a difference significant at the
p < .015 level (two-tailed). In addition, clients preparing
their own follow-up guides reported greater
consumer satisfaction, with the significance of the
difference in means ranging, in terms of specific
questions, from p < .05 to p < .10.



3. Client Characteristics and the Goal Attainment
Score

In contrast to the treatment comparisons made
in points number 1 and number 2 above, most non-treatment
variables should not be related to the mean
Goal Attainment score. The individualized development
of the Goal Attainment Follow-up Guide should
take into account such client-specific differences
when the expectations on the follow-up guide are set.

FIGURE IV: Client's Demographic Variables (A1b, A1c, A1d);
"No" relationship; Mean Goal Attainment Score (B3a)

TABLE III: Judgments on a Standardized Information
Form Correlated



This no-difference example is significant since
it (a) suggests, by contrast, the importance of Goal
Attainment score differences due to different degrees
of treatment effectiveness and (b) indicates that the
Goal Attainment Follow-up Guide can be specially
adapted to each client, which is a key assertion of
Goal Attainment theory. This portion of Goal Attainment
Scaling validation is strongly supported. In
January of 1972, a series of linear regressions was
calculated for the relationship between the Goal Attainment
score, which had a mean of 50.55, and client
demographic variables (Baxter and Tripp, 1972). The
correlations did not reach the p < .10 level of statistical
significance. The highest correlations were
for income measures, which were correlated .20 and
.18 with the Goal Attainment score. See Table II.



TABLE II: Correlations with Goal Attainment Score
          Mean (Pearson Product Moment) or Equivalent

    AGE                          .034
    SEX                         -.049
    MARITAL STATUS               .01
    NUMBER OF CHILDREN           .075
    "IS INCOME ADEQUATE"         .111
    INCOME SOURCE                .184
    INCOME AMOUNT                .204
    HIGHEST GRADE COMPLETED      .054



Presenting problems mentioned by the clients
also had low, statistically nonsignificant correlations
with the Goal Attainment score according to the
data in that study. Some of these presenting problem
variables are dichotomous or polychotomous and
were correlated accordingly. A few examples are
presented here in Table III.



TABLE III

    Severity of Problem (N = 193)                    .102
    Thoughts of Suicide (N = 178)                    .153
    "Are Meds being Taken Currently?" (N = 158)      .142



The June 1972 report by Baxter and Tripp
mentioned before suggests that even cross-analysis
by age and sex does not lead to statistically different
(at the p < .10 level) mean Goal Attainment
scores, as shown in Table IV.

TABLE IV: Goal Attainment Scores by Sex and Age-Group
          for 186 Clients

    Sex        Young      Old
    Male       50.11      53.00
    Female     50.17      50.57



Thus, in these samples Goal Attainment scores 
seem to be largely independent of client character- 
istic variables. 

A similar finding is suggested by data released
on September 15, 1971, which showed a low
correlation between Shipley-Hartford Intelligence
scores and the Goal Attainment scores (Meade, 1971).



B. "The Meaningfulness of the Test Scores"

This is Ebel's second major aspect of validity. 
He includes four subpoints, each of which is dis- 
cussed below. 

1. Operational Definition of the Measure- 
ment Procedures 

A large amount of operational flexibility is
basic to Goal Attainment Scaling. This methodology
is not a single, rigidly determined set of
procedures, but a collection of guidelines from
which procedures may be, within some limits, established
to fit the needs of each agency. The
various Project publications are intended to outline
a range of approaches and to describe what
has been done in the Project's research. The publications
are not designed to establish definitive,
permanently fixed procedures, but to allow a range
of implementations of Goal Attainment Scaling.
(See Programmed Instruction in Goal Attainment
Scaling, Garwick, 1973, and Chapter One of the
P.E.P. Report, 1969-1973.)

2. Knowledge of the Relationship of the 
Scores to Other Measures through Co- 
efficients 

This topic is the second subpoint within Ebel's
second general area, "The meaningfulness of the
test scores". This aspect of a measurement system
would commonly be called "concurrent validity",
i.e., validation through other measurement devices.
In this instance, the "other measures" used for
comparison should be other program evaluation or
treatment outcome measures, as opposed to the
client characteristic measures discussed earlier.

As suggested earlier, Goal Attainment scores
are not intended to have a particularly high correlation
with other treatment outcome measurement
devices, since Goal Attainment Scaling is such a
radically different evaluation system (Kiresuk and
Sherman, 1968).

a. Concurrent Validity in the Drug Effectiveness
Study of the Campbell-Fiske Matrix

The predicted relationships within some of the 
major outcome measures of the Program Evaluation 
Project "Drug Effectiveness Study", appear below in 
a part of the nomological net. 



FIGURE V 




The actual results, based on the first block
of 20 cases completed for the Drug Study, appear in
Table V, a modified Campbell-Fiske (1959) matrix.
This matrix is based on data supplied
by Baxter and Jones (1973).

This application of the matrix is modified
from the Campbell-Fiske concept by the assumption
that the clients' outcomes at a given time are the
equivalent in the matrix of a "trait". For example,
initial status is one trait, outcome at three weeks
after intake is another trait, outcome after two
months is a third trait, and outcome at the final
follow-up is the fourth trait. There are four variables
to measure these traits, except that the
Taylor Manifest Anxiety Scale was not administered
at the three-week follow-up. (The places on the
matrix which are left blank because the Manifest
Anxiety Scale was not administered at the three-week
follow-up are marked by a capital "X".) Thus,
each of the four measurement points in the Drug
Effectiveness Study is utilized in the matrix
in the same relationship as a personality trait
would be utilized in the original conception of
the matrix. The matrix should be interpreted,
of course, in terms of the construct relationships
expected for Goal Attainment Scaling.
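
The layout of this modified matrix can be sketched in a few lines
of Python. This is only an illustration, not part of the original
analysis: the method abbreviations, the occasion labels, and all
function names are assumptions introduced here. Each variable is a
(method, occasion) pair, and each pair of variables falls into one
of three kinds of cell.

```python
from itertools import combinations

# Illustrative labels only (assumed, not taken from the study's files).
METHODS = ["GAS", "BPRS", "SRSS", "MAS"]
OCCASIONS = ["initial", "3 weeks", "2 months", "final"]
# The Manifest Anxiety Scale was not given at the three-week follow-up.
NOT_ADMINISTERED = {("MAS", "3 weeks")}


def matrix_variables():
    """List every (method, occasion) pair that was actually measured."""
    return [
        (method, occasion)
        for method in METHODS
        for occasion in OCCASIONS
        if (method, occasion) not in NOT_ADMINISTERED
    ]


def classify_pair(a, b):
    """Name the kind of matrix cell formed by two distinct variables."""
    (method_a, occasion_a), (method_b, occasion_b) = a, b
    if method_a == method_b:
        return "reliability"   # same measure, different occasions
    if occasion_a == occasion_b:
        return "convergent"    # different measures, same occasion
    return "other"             # different measures, different occasions


def count_cells():
    """Count cells by kind and by whether Goal Attainment Scaling is involved."""
    counts = {}
    for a, b in combinations(matrix_variables(), 2):
        key = (classify_pair(a, b), "GAS" in (a[0], b[0]))
        counts[key] = counts.get(key, 0) + 1
    return counts


if __name__ == "__main__":
    for (kind, involves_gas), n in sorted(count_cells().items()):
        print(kind, "with GAS" if involves_gas else "without GAS", n)
```

Under these assumptions the sketch yields ten convergent cells that
do not involve Goal Attainment Scaling, six same-measure cells for
Goal Attainment Scaling, and 15 same-measure cells for the other
three measures, which agrees with the counts cited in the
discussion of Table V.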

The concurrent or "convergent" validities are
represented on the matrix by the numbers not included
in the triangles. As predicted, correlations
between Goal Attainment Scaling and the other measures
are generally low. Except for the correlation
between Goal Attainment Scaling and the Manifest
Anxiety Scale at the two-month follow-up, which was
.52, the correlations are all below .30, with seven
below .20. Only the .52 correlation reaches the
p < .05 level of significance (two-tailed). By contrast,
of the ten concurrent validity correlations
not involving Goal Attainment Scaling (for example,
the correlations between the Taylor Manifest Anxiety
Scale and the Self-Rating Symptom Scale, which are
.73 at the initial measurement, .74 at the two-month
follow-up, and .80 at the three-month follow-up),
eight are over .30, with five over .50 and three
over .70.
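
As a rough check on statements of this kind, the sketch below
estimates a two-tailed significance level for a Pearson correlation
from its size and the number of cases. It uses the Fisher z
transformation with a normal approximation, which is an assumption
made here for simplicity and not necessarily the test used in the
original analysis; with only 20 cases the result is approximate.
The function name is hypothetical.

```python
import math
from statistics import NormalDist


def approx_two_tailed_p(r, n):
    """Approximate two-tailed p-value for a Pearson r based on n cases.

    Uses the Fisher z transformation: atanh(r) * sqrt(n - 3) is treated
    as a standard normal deviate under the hypothesis of zero correlation.
    """
    if n <= 3:
        raise ValueError("n must be greater than 3")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # The two correlation sizes discussed above, with the block of 20 cases.
    for r in (0.52, 0.30):
        print(r, round(approx_two_tailed_p(r, 20), 3))
```

With 20 cases, this approximation places a correlation of .52 under
the .05 level and a correlation of .30 well above it, in line with
the statement that only the .52 correlation is significant.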

The first type of "discriminant validity" in
Wiggins' discussion of the Campbell and Fiske matrix
(Wiggins, 1973) requires that convergent validities
be larger than correlations between different measures.
This validity criterion is not met
consistently in this study by any of the outcome measurement
devices, and particularly not by the Goal
Attainment Scaling data, suggesting that correlations
between two different outcome measures taken at a
single follow-up time are not consistently higher
than correlations involving different measures or
different follow-up times. Similarly, the second
and third types of discriminant validity are not
met, suggesting that correlation in outcome due
to "method" variance (which in this case is not
really "method" variance but rather should be
called follow-up variance) is as great or greater
than correlation due to different follow-up times.

The solid triangles illustrate the "reliability"
correlations. Since in reality all these
outcome measures are based on three different
times, some change might be expected in the relative
performance of the clients. Out of the six
Goal Attainment Scaling correlations, four are
below .20, while for the 15 correlations from
the other measures, only two are less than .30,
11 are greater than .50, and six are greater
than .60. This pattern of outcome reliability
suggests that the Goal Attainment score is less
stable than the other outcome measures. Whether
or not this smaller degree of stability represents
a greater sensitivity to real shifts in
outcome will have to be determined by future
studies. The Drug Effectiveness Study is continuing
and later data may help illuminate the
situation.

TABLE V: An Outcome Evaluation Multitrait-Multimethod Matrix
for the Drug Effectiveness (Valium versus Psychotherapy) Study
(N=20)

    Methods:  1  Goal Attainment Scaling
              2  Brief Psychiatric Rating Scale
              3  Self Rating Symptom Scale
              4  Manifest Anxiety Scale

    Traits (for each method):  Initial, 3 Weeks*, 2 Months, 6 Months

    [matrix entries illegible]

* The Manifest Anxiety Scale was not scored at the three-week
follow-up.

b. Concurrent Validity and the Consumer
Satisfaction Index

Consumer satisfaction results have also been
compared with the Goal Attainment score. In a
February 1974 report, Oreyer noted a Pearson
Product-Moment Correlation between Baxter's seven-item
Consumer Satisfaction Index and the Goal Attainment
score of .21 for 686 followed-up cases. In
a 1973 report, Baxter presented correlations between
the Goal Attainment score and 12 of the
various individual items on the Consumer Satisfaction
inventory for 202 randomly assigned
clients. These 12 correlations ranged from -.12
to .46, with four over .20. This report also
showed the Goal Attainment score and the Consumer
Satisfaction Index, which is based on seven
items, to be correlated .23 (for 199 clients).

c. Concurrent Validity and Therapist Ratings 

A useful addition to the Goal Attainment Scaling
validation data is the combination Validity/
Reliability Study by Baxter (Baxter, 1973). In
this study, therapists were asked to complete
three procedures for clients followed up after
April 15, 1972: 1) answer two global rating questions
about a client's progress in therapy, 2)
score the client's Goal Attainment Follow-up
Guide before seeing the scores from the follow-up
interviewer, and 3) rate the "relevance", "optimism"
versus "pessimism", and need for additional scales
for each client's scales. The global ratings and
the relevance, optimism/pessimism, and need for additional
scales ratings are valuable concurrent validation
procedures.

It would be expected that the correlations between
Goal Attainment scores and the individual
global ratings would be somewhat higher than between
Goal Attainment scores and more standardized measures
such as the BPRS or SRSS as illustrated in Table V.
Table VI shows the recorded Pearson Correlations for
the Validity/Reliability Study. When therapist global
ratings and Goal Attainment scores are compared, the
correlations range from .582 to .849.



TABLE VI

Question 1: "Indicate how well, in your opinion, each patient
did in relation to the typical patient in your caseload..."

Question 2: "Indicate how successful, in your opinion, was your
interaction with this patient in relation to the typical patient
in your caseload..."

                                      Question 1                     Question 2
                               Outpatient   Day Treatment     Outpatient   Day Treatment
                                 N = 53        N = 8            N = 53        N = 6

Correlation with Therapist
  Follow-up Score                 .582          .689             .604          .649

Correlation with Follow-up
  Interviewer Follow-up Score     .325          .192             .319          .507



d. Concurrent Validity and the MMPI 

Mauger has found a Pearson Correlation of .285 
between the Goal Attainment score and the MMPI mean- 
change Index, and a correlation of .306 between the 
Goal Attainment change score and the MMPI mean-change 
score. Neither coefficient reached the p<.05 level 
of statistical significance. 

e. Concurrent Validity and Predictive 
Accuracy 

In an analysis of data for the forty-four cases
in the original reliability study (see the P.E.P. Report,
1969-1973 chapters on follow-up and reliability)
a number of measures were investigated (Twedt, 1974).
In this study, for each client, one follow-up guide
was constructed by the intake interviewer and a second
follow-up guide was constructed by the therapist.
Then, these two follow-up guides were combined and
this combined follow-up guide was scored independently
at two different interviews by two different
interviewers.

For the therapist-constructed follow-up guides,
the Goal Attainment score was correlated only .18
with the Consumer Satisfaction Index, but was correlated
.31 (significant at the p < .05 level) with
predictive accuracy of the follow-up guide (i.e.,
mean inaccuracy score, the absolute deviation from
the expected mean of 50). There was no
correlation which reached the p < .05 level of significance
between either the Goal Attainment score
for the therapist or for the intake interviewer
follow-up guides and variables of age, sex, or number
of treatment sessions with the clients, although
the intake interviewer Goal Attainment score was
correlated -.23 with age of the client. For the intake
interviewer constructed follow-up guides, the
Goal Attainment score was correlated only .24 with
the Consumer Satisfaction Index and .16 with predictive
accuracy of the follow-up guide. Thus, the predictive
accuracy was correlated significantly with
the Goal Attainment scores for the therapist but not
for the intake interviewer, suggesting that the predictions
could have influenced the course of therapy
in this case, which is very compatible with the
fact that intake interviewers were not involved in
the therapy directly.



All these results suggest that Goal Attainment
Scaling is not highly correlated with other measurement
systems. Nonetheless, there is a positive
correlation with the other systems, and the
correlations are in the anticipated low to moderate
range.

The "Change score" is another possible concurrent
measure. Present data suggest moderate correlations
between the Goal Attainment score and
the Goal Attainment Change score in the .10 to .30
range for various groups. (See the chapter on the
Change score in the P.E.P. Report, 1969-1973.)

3. Reliability Measures

Reliability, Ebel's third subpoint in his
second general area, "The Meaningfulness of the
Test Scores", has been investigated repeatedly
for Goal Attainment Scaling. It is not clear
that his logic is correct when he includes reliability
under the "meaningfulness" rubric, but
it is included here in order to follow his outline.
Extensive discussion of the reliability
results is available in the chapter on Reliability
and the Goal Attainment Scaling
Methodology in P.E.P. Report 1969-1973.

a. One such study, the Interdisciplinary Reliability
Follow-up Study, was completed late in
1972. (See the Follow-up Chapter in the P.E.P.
Report, 1969-1973.) The study examined differences
between interviewer discipline, telephone
versus in-person interviewing, and the first
versus the second interview. This arrangement
of repeated trials will tend to minimize the estimate
of reliability, since follow-up interviewers
and other variables can change in the interval
between the two interviews (a mean of 27 days).
(See the chapter on Reliability of Goal Attainment
Scaling in the P.E.P. Report, 1969-1973.)

Even in these demanding circumstances, the
overall Pearson Correlation for Goal Attainment
scores in the first and second interviews was
.595 (N=59). When the discipline of the interviewer
was held constant, the correlation was
.623 for MSW interviewers (N=13) and .750 for
RN interviewers (N=10). When the discipline of
the interviewer in follow-up one was different
from that of the interviewer in follow-up two, the
correlations were slightly lower.

b. The Combination Validity/Reliability Study
mentioned previously gives another viewpoint on
reliability (Baxter, 1973). Here, therapists and
the follow-up interviewers scored the Goal Attainment
Follow-up Guides independently. Again, this
is an extremely severe test of reliability, since
there was a time span between the two interviews,
and the sources of information were clearly much
different for therapists and follow-up interviewers.
Nevertheless, as Table VII illustrates, the correlations
ranged from .46 to .85.

TABLE VII

Pearson Product-Moment Correlation between Goal Attainment Scores for
clients based on independent (1) therapist scorings and (2) follow-up
interviewer scorings.

    Adult OPD        N = 53      .507
    Day Treatment    N = 8       .848
    All Clients      N = 61      .458



These various results suggest a fair degree of
reliability, with internal consistency coefficients
in the .60 range and inter-rater agreement reliability
coefficients in the .45 to .75 range.
There seems to be sufficient Goal Attainment score
consistency to make the methodology useful in a
number of situations.

4. Appropriate Norms of Examinee Performance 

Ebel's fourth subpoint under his general area
of "The Meaningfulness of the Test Scores" is somewhat
difficult to translate directly into Goal Attainment
Scaling terms. Perhaps the most useful
data on "norms" would be the mean Goal Attainment
score of 50 and standard deviation of about 10,
where each agency is seen as an "examinee". As
mentioned earlier, most agencies using Goal Attainment
Scaling have achieved such norms.

Another possible approach to establishing norms 
is some type of content analysis. A beginning on 
such norms is available from "Expectations and Goals 
for Clients at a Community Mental Health Service". 
(Garwick and Lampman, 1972) This study shows some 
norms for quantifiable variables for the Mental 
Health Service, but does not examine norms from other 
agencies, although Goal Attainment Follow-up Guides 
from several other agencies have been collected, and 
will be further analyzed in the future. 



C. Technical Refinements, or the "Convenience of
the Test"

Ebel's third major area of "test quality" refers
to the ease of the test's implementation and
interpretation. Some improvements have been made
in the convenience of Goal Attainment Scaling, most
recently Baxter's Conversion Key for Calculating
Goal Attainment Scores from Unweighted Scores (Baxter,
1973) and Garwick and Brintnall's Goal Attainment
Scaling Calculation Tables (1973). In general,
however, the convenience of the methodology could
be improved further. The Goal Attainment Scaling
methodology seems to be inherently attractive to many
clinicians and administrators, but both (1) construction
of the Goal Attainment Follow-up Guide and (2)
interpretation of the Goal Attainment score may appear
inconvenient to some.

1. Convenience of Construction of the Goal
Attainment Follow-Up Guide

Some persons have praised the Goal Attainment
Scaling concept because it enables evaluation to become
part of the therapy and can be incorporated into
the interaction with the client. Others, however,
have complained of the difficulty of training
personnel in Goal Attainment Scaling and in finding
enough time to produce the Goal Attainment Follow-up
Guides. One possible amelioration of this difficulty
is the client construction of the follow-up
guides, as illustrated by the Guide to Goals, Format
One approach (Garwick, 1972).

The "review" or "monitoring" of the Goal Attainment
Follow-up Guide is a closely related and difficult
problem. As the Program Evaluation Project
staff began accumulating Goal Attainment Follow-up
Guides, it seemed that some follow-up guides produced
by the clinicians included clerical, logical,
or descriptive shortcomings which rendered follow-up
scoring very difficult or questionable ("Manual on
Follow-up Assessment", Garwick et al., 1972). Various
monitoring and follow-up guide revision techniques
have been utilized, but none has been completely
satisfactory, due to clinician dislike
of the monitoring and a limited empirical basis
for the monitoring criteria. The monitoring is
expensive for the evaluation staff, and evidently
unattractive to the clinical staff, yet the
inconvenience and cost of follow-up guides with
severe errors can be troublesome. Thus, review
and revision of the Goal Attainment Follow-up
Guide and the amount of time required for training
and completing the follow-up guide are two
of the more frequent complaints.

The client-specific nature of the Goal Attainment
Scaling methodology, however, is apparently
popular. One possibility for increasing
client involvement is the "programmed"
Guide to Goals, Format One, which was mentioned
previously. If this device makes it possible
to allow clients to construct their own follow-up
guides with minimal clinical supervision, the
convenience of Goal Attainment Scaling may be
greatly increased. One possibility for minimizing
the need for review is a "semi-standardized" system
where idiosyncratic (i.e., client-specific)
construction is retained but a catalogue of possible
Goal Attainment variables is used to decrease
the time required for follow-up guide construction.



2. Convenience of the Interpretation of 
the Goal Attainment Score 

The Goal Attainment score is based on the
formula from Kiresuk and Sherman's 1968 article.
This formula seems formidable to some. One of
the earliest attempts to ease Goal Attainment
score calculation was the release of a short,
simplified description of the calculation in
simple steps. As mentioned above, Baxter has
produced a conversion key, and recently Sherman
has suggested the possibility of simpler, scale-by-scale
calculation of follow-up level score
means for each follow-up guide. (See Chapter
One of the P.E.P. Report, 1969-1973.) A series
of tables has also been developed which makes it
possible to obtain the Goal Attainment score
without calculation for Goal Attainment Follow-up
Guides with from one to five scales (Garwick
and Brintnall, 1973).
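
For readers who prefer to see the calculation spelled out, the
following Python sketch computes the score as the formula is
commonly stated: 50 plus 10 times the weighted sum of attainment
levels, divided by the square root of (1 - rho) times the sum of
squared weights plus rho times the square of the summed weights.
The function name is hypothetical, and the sketch assumes the usual
coding of attainment levels from -2 to +2 and the conventional
value of .30 for rho, the assumed average intercorrelation among
scales; neither value is restated in this chapter.

```python
import math


def goal_attainment_score(levels, weights=None, rho=0.3):
    """Goal Attainment score for one follow-up guide.

    levels  -- attainment level on each scale (assumed coded -2 to +2)
    weights -- relative weight of each scale (defaults to equal weights)
    rho     -- assumed average intercorrelation among the scales
    """
    if not levels:
        raise ValueError("at least one scale is required")
    if weights is None:
        weights = [1.0] * len(levels)
    if len(weights) != len(levels):
        raise ValueError("levels and weights must have the same length")
    weighted_sum = sum(w * x for w, x in zip(weights, levels))
    sum_of_squares = sum(w * w for w in weights)
    square_of_sum = sum(weights) ** 2
    denominator = math.sqrt((1.0 - rho) * sum_of_squares + rho * square_of_sum)
    return 50.0 + 10.0 * weighted_sum / denominator


if __name__ == "__main__":
    # Expected level on every scale, then one level better on three scales.
    print(goal_attainment_score([0, 0, 0]))
    print(round(goal_attainment_score([1, 1, 1]), 2))
```

A guide on which every scale is scored at the expected level (0)
yields exactly 50 under this formula, and a single scale scored one
level above expectation yields 60, which is consistent with the
norms of a mean of 50 and a standard deviation of about 10.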

Because of the newness of Goal Attainment
Scaling, comments on interpreting the Goal Attainment
score have been purposely restricted.
In general, the Goal Attainment score is said
to be "the degree to which expectations for
outcome at some certain time are reached".



Conclusion 

Too often, it is forgotten that the Goal Attainment
score is merely a tool, like the MMPI,
the Strong Vocational Interest Blank, or the
Wechsler Adult Intelligence Scale. Like any measurement
tool, the Goal Attainment score is as
useful as the design or system with which it is
utilized. The degree of scientific irreproachability
or clinical completeness depends on the
way the methodology is integrated into a practical
or experimental procedure.